-
Notifications
You must be signed in to change notification settings - Fork 426
Block schema field drop if it is reference by an active partition or sort field #2410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d4928bd to
81732d4
Compare
Anton-Tarazi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, left a minor comment :)
81732d4 to
7e47009
Compare
|
@Anton-Tarazi thanks for the review (this PR went off my radar) I have rebased and applied recommended changes |
geruh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some suggestions to get this moving for the 0.11 release. These fix the two bugs inverted allow_missing_fields logic, and string interpolation, also, we simplify the code by using existing schema._lazy_id_to_parent
7e47009 to
0a7a39e
Compare
|
Thanks for the review @geruh applied the comments |
geruh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gabeiglio this LGTM!!
kevinjqliu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! thanks for the pr
couple of nits but not a blocker
| path = "/".join([field_str + "=" + value_str for field_str, value_str in zip(field_strs, value_strs, strict=True)]) | ||
| return path | ||
|
|
||
| def check_compatible(self, schema: Schema, allow_missing_fields: bool = False) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: doesnt look like allow_missing_fields is used anywhere, should we still keep it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check_compatible in Java its only used with allowMissingField=true (ctx) when reading metadata tables. Since here we only use this check (for now) at write path only, I think we could remove this field.
5d62d54 to
2a65eb7
Compare
2a65eb7 to
e12f66e
Compare
kevinjqliu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the fixes! i think we're almost there
| while parent_id: | ||
| parent_type = schema.find_type(parent_id) | ||
| if not parent_type.is_struct: | ||
| raise ValidationError(f"Invalid partition field parent: {parent_type}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a bit more info here too
|
Thanks for the PR @gabeiglio and thanks for the review @jayceslesar @Anton-Tarazi @geruh |
Closes #2166
Rationale for this change
We should block when an user wants to drop a column if that column is being referenced by either a active partition spec or sort order field.
Are these changes tested?
Yes, I added unit tests for every incompatible schema change in partitions and sort orders. Also added two new integration tests in
test_catalogto test for this scenarioAre there any user-facing changes?
No